Overview

Dataset statistics

Number of variables: 7
Number of observations: 217885
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 11.6 MiB
Average record size in memory: 56.0 B
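
The memory figures above are consistent with every value occupying 8 bytes. The sketch below checks that arithmetic. The 8-byte width (as for int64 or float64 columns) is an assumption, since the report does not list the column dtypes.

```python
# Back-of-the-envelope check of the memory figures in the report.
# Assumption: each of the 7 numeric columns stores 8-byte values
# (e.g. int64 or float64); the report itself does not state the dtypes.
N_ROWS = 217_885
N_COLUMNS = 7
BYTES_PER_VALUE = 8  # assumed, not stated in the report

MIB = 2 ** 20  # bytes per mebibyte

record_bytes = N_COLUMNS * BYTES_PER_VALUE
total_mib = N_ROWS * record_bytes / MIB
column_mib = N_ROWS * BYTES_PER_VALUE / MIB

print(f"record size: {record_bytes} B")
print(f"total size:  {total_mib:.1f} MiB")
print(f"per column:  {column_mib:.1f} MiB")
```

Under that assumption the results agree with the reported 56.0 B per record, 11.6 MiB in total, and the 1.7 MiB memory size listed for each variable.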

Variable types

Numeric: 7

Warnings

df_index has unique values (UNIQUE)
Vacancy_Rate% has 4712 (2.2%) zeros (ZEROS)

Reproduction

Analysis started: 2021-02-23 16:16:34.871326
Analysis finished: 2021-02-23 16:18:59.053777
Duration: 2 minutes and 24.18 seconds
Software version: pandas-profiling v2.10.0
Download configuration: config.yaml
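
The reported duration can be recomputed from the two timestamps. This is a small illustrative check using only the standard library, not the code the profiling tool itself runs.

```python
from datetime import datetime

# Timestamps copied from the Reproduction block of the report.
FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2021-02-23 16:16:34.871326", FMT)
finished = datetime.strptime("2021-02-23 16:18:59.053777", FMT)

elapsed = finished - started
minutes, seconds = divmod(elapsed.total_seconds(), 60)

print(f"{int(minutes)} minutes and {seconds:.2f} seconds")
```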

Variables

df_index
Real number (ℝ≥0)

UNIQUE

Distinct: 217885
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 134658.4531
Minimum: 1
Maximum: 264959
Zeros: 0
Zeros (%): 0.0%
Memory size: 1.7 MiB

Quantile statistics

Minimum: 1
5-th percentile: 13534.2
Q1: 69303
Median: 136264
Q3: 200994
95-th percentile: 252198.8
Maximum: 264959
Range: 264958
Interquartile range (IQR): 131691

Descriptive statistics

Standard deviation: 76336.12375
Coefficient of variation (CV): 0.5668869793
Kurtosis: -1.193004004
Mean: 134658.4531
Median Absolute Deviation (MAD): 65859
Skewness: -0.0407734205
Sum: 2.934005706 × 10^10
Variance: 5827203789
Monotonicity: Strictly increasing
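
The quantile and descriptive statistics above follow standard definitions. The sketch below recomputes several of them for a small list using only the standard library. The function names `summarize` and `monotonicity` are hypothetical, the use of the sample standard deviation (n - 1 denominator) and linear quantile interpolation are assumptions about how the report was computed, and the non-strict monotonicity labels are illustrative.

```python
import statistics


def monotonicity(values):
    """Classify the order of a sequence, using labels like those in the report."""
    pairs = list(zip(values, values[1:]))
    if all(a < b for a, b in pairs):
        return "Strictly increasing"
    if all(a > b for a, b in pairs):
        return "Strictly decreasing"
    if all(a <= b for a, b in pairs):
        return "Increasing"
    if all(a >= b for a, b in pairs):
        return "Decreasing"
    return "Not monotonic"


def summarize(values):
    """Return a dict of summary statistics for a list of numbers."""
    data = sorted(values)
    # "inclusive" interpolates linearly between the minimum and maximum.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    mean = statistics.fmean(data)
    std = statistics.stdev(data)  # sample standard deviation (assumed)
    mad = statistics.median([abs(v - median) for v in data])
    return {
        "range": data[-1] - data[0],
        "iqr": q3 - q1,
        "median": median,
        "mad": mad,                # median absolute deviation
        "variance": std ** 2,
        "cv": std / mean,          # coefficient of variation
        "monotonicity": monotonicity(values),  # uses the original order
    }


# The five smallest df_index values, in their original order.
print(summarize([1, 2, 4, 5, 6]))
```

Note that the monotonicity check runs on the values in their original row order, while the quantiles are order-independent.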
Histogram with fixed size bins (bins=50)
Value                    Count     Frequency (%)
2047                     1         < 0.1%
164357                   1         < 0.1%
178690                   1         < 0.1%
172545                   1         < 0.1%
174592                   1         < 0.1%
262613                   1         < 0.1%
86399                    1         < 0.1%
88446                    1         < 0.1%
82301                    1         < 0.1%
84348                    1         < 0.1%
Other values (217875)    217875    > 99.9%

Value                    Count     Frequency (%)
1                        1         < 0.1%
2                        1         < 0.1%
4                        1         < 0.1%
5                        1         < 0.1%
6                        1         < 0.1%

Value                    Count     Frequency (%)
264959                   1         < 0.1%
264958                   1         < 0.1%
264957                   1         < 0.1%
264956                   1         < 0.1%
264955                   1         < 0.1%

RentPrice
Real number (ℝ≥0)

Distinct: 145391
Distinct (%): 66.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1095.392952
Minimum: 19.96
Maximum: 5620.32
Zeros: 0
Zeros (%): 0.0%
Memory size: 1.7 MiB

Quantile statistics

Minimum: 19.96
5-th percentile: 608.8712
Q1: 804.206
Median: 966.296
Q3: 1236.036
95-th percentile: 1994.636
Maximum: 5620.32
Range: 5600.36
Interquartile range (IQR): 431.83

Descriptive statistics

Standard deviation: 493.4443644
Coefficient of variation (CV): 0.4504724663
Kurtosis: 12.9915483
Mean: 1095.392952
Median Absolute Deviation (MAD): 195.706
Skewness: 2.763417001
Sum: 238669693.2
Variance: 243487.3408
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                    Count     Frequency (%)
1281.736                 363       0.2%
731.736                  347       0.2%
681.736                  335       0.2%
1006.736                 313       0.1%
631.736                  307       0.1%
831.736                  291       0.1%
781.736                  285       0.1%
1106.736                 258       0.1%
581.736                  244       0.1%
881.736                  240       0.1%
Other values (145381)    214902    98.6%

Value                    Count     Frequency (%)
19.96                    4         < 0.1%
94.96                    4         < 0.1%
103.29                   1         < 0.1%
139.4                    1         < 0.1%
144.96                   6         < 0.1%

Value                    Count     Frequency (%)
5620.32                  2         < 0.1%
5619.795                 1         < 0.1%
5616.46                  3         < 0.1%
5563.03                  1         < 0.1%
5558.206                 102       < 0.1%

SizeRank
Real number (ℝ≥0)

Distinct: 11050
Distinct (%): 5.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 14615.05118
Minimum: 0
Maximum: 34430
Zeros: 8
Zeros (%): < 0.1%
Memory size: 1.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 1379.2
Q1: 6938
Median: 14027
Q3: 21854
95-th percentile: 29685
Maximum: 34430
Range: 34430
Interquartile range (IQR): 14916

Descriptive statistics

Standard deviation: 8955.875344
Coefficient of variation (CV): 0.6127843984
Kurtosis: -1.047220924
Mean: 14615.05118
Median Absolute Deviation (MAD): 7426
Skewness: 0.1963601593
Sum: 3184400426
Variance: 80207703.18
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                    Count     Frequency (%)
29964                    230       0.1%
28097                    201       0.1%
30545                    200       0.1%
24938                    186       0.1%
27522                    185       0.1%
27365                    184       0.1%
25092                    183       0.1%
32062                    179       0.1%
29685                    178       0.1%
29401                    177       0.1%
Other values (11040)     215982    99.1%

Value                    Count     Frequency (%)
0                        8         < 0.1%
1                        8         < 0.1%
2                        8         < 0.1%
3                        8         < 0.1%
4                        8         < 0.1%

Value                    Count     Frequency (%)
34430                    27        < 0.1%
34322                    75        < 0.1%
34302                    16        < 0.1%
34258                    2         < 0.1%
34247                    2         < 0.1%

HomePrice
Real number (ℝ≥0)

Distinct: 210975
Distinct (%): 96.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 185450.5809
Minimum: 10956.33
Maximum: 6141945.92
Zeros: 0
Zeros (%): 0.0%
Memory size: 1.7 MiB

Quantile statistics

Minimum: 10956.33
5-th percentile: 50771.966
Q1: 88016.67
Median: 134667.5
Q3: 214877
95-th percentile: 484188.002
Maximum: 6141945.92
Range: 6130989.59
Interquartile range (IQR): 126860.33

Descriptive statistics

Standard deviation: 185121.166
Coefficient of variation (CV): 0.9982237056
Kurtosis: 60.3893833
Mean: 185450.5809
Median Absolute Deviation (MAD): 55752.83
Skewness: 5.483810985
Sum: 4.040689981 × 10^10
Variance: 3.426984612 × 10^10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                    Count     Frequency (%)
57537.17                 4         < 0.1%
110771.67                4         < 0.1%
54673.67                 4         < 0.1%
75169.83                 4         < 0.1%
81272                    4         < 0.1%
88629                    4         < 0.1%
236968.25                3         < 0.1%
85790.42                 3         < 0.1%
85846.33                 3         < 0.1%
109707.5                 3         < 0.1%
Other values (210965)    217849    > 99.9%

Value                    Count     Frequency (%)
10956.33                 1         < 0.1%
11688                    1         < 0.1%
11860.83                 1         < 0.1%
12041.42                 1         < 0.1%
12062.83                 1         < 0.1%

Value                    Count     Frequency (%)
6141945.92               1         < 0.1%
5373670.92               1         < 0.1%
4928414.67               1         < 0.1%
4522642.08               1         < 0.1%
4260975                  1         < 0.1%

Vacancy_Rate%
Real number (ℝ≥0)

ZEROS

Distinct: 164679
Distinct (%): 75.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 16.17859668
Minimum: 0
Maximum: 99.83974359
Zeros: 4712
Zeros (%): 2.2%
Memory size: 1.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 2.820812062
Q1: 7
Median: 12.02058775
Q3: 20.33293698
95-th percentile: 45.93904196
Maximum: 99.83974359
Range: 99.83974359
Interquartile range (IQR): 13.33293698

Descriptive statistics

Standard deviation: 14.00675651
Coefficient of variation (CV): 0.865758433
Kurtosis: 4.84757007
Mean: 16.17859668
Median Absolute Deviation (MAD): 5.952818947
Skewness: 2.002355621
Sum: 3525073.538
Variance: 196.1892279
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                    Count     Frequency (%)
0                        4712      2.2%
20                       179       0.1%
25                       158       0.1%
16.66666667              158       0.1%
14.28571429              153       0.1%
11.11111111              129       0.1%
12.5                     123       0.1%
33.33333333              118       0.1%
10                       113       0.1%
8.333333333              96        < 0.1%
Other values (164669)    211946    97.3%

Value                    Count     Frequency (%)
0                        4712      2.2%
0.02272727273            1         < 0.1%
0.1114827202             1         < 0.1%
0.1248439451             1         < 0.1%
0.1402524544             1         < 0.1%

Value                    Count     Frequency (%)
99.83974359              1         < 0.1%
99.65337955              1         < 0.1%
99.57386364              1         < 0.1%
99.39577039              1         < 0.1%
99.27007299              1         < 0.1%

Zipcode_2
Real number (ℝ≥0)

Distinct: 99
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 48.33178971
Minimum: 0
Maximum: 99
Zeros: 51
Zeros (%): < 0.1%
Memory size: 1.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 6
Q1: 26
Median: 48
Q3: 71
95-th percentile: 95
Maximum: 99
Range: 99
Interquartile range (IQR): 45

Descriptive statistics

Standard deviation: 27.47999887
Coefficient of variation (CV): 0.5685698593
Kurtosis: -1.05133683
Mean: 48.33178971
Median Absolute Deviation (MAD): 23
Skewness: 0.09513969524
Sum: 10530772
Variance: 755.1503381
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                    Count     Frequency (%)
49                       3588      1.6%
48                       3541      1.6%
12                       3525      1.6%
95                       3459      1.6%
56                       3421      1.6%
54                       3393      1.6%
28                       3278      1.5%
61                       3229      1.5%
98                       3223      1.5%
45                       3137      1.4%
Other values (89)        184091    84.5%

Value                    Count     Frequency (%)
0                        51        < 0.1%
1                        2227      1.0%
2                        2246      1.0%
3                        1812      0.8%
4                        2616      1.2%

Value                    Count     Frequency (%)
99                       1423      0.7%
98                       3223      1.5%
97                       2879      1.3%
96                       1386      0.6%
95                       3459      1.6%
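
The "Other values" row in each frequency table is the remainder after the ten most common values are listed. As a worked check, the sketch below recomputes that row for Zipcode_2 from the counts reported in this document. The variable names are illustrative only.

```python
# Counts of the ten most common Zipcode_2 values, as listed in the report.
top_counts = [3588, 3541, 3525, 3459, 3421, 3393, 3278, 3229, 3223, 3137]

N_OBSERVATIONS = 217_885  # "Number of observations" in the overview
N_DISTINCT = 99           # "Distinct" for Zipcode_2

other_count = N_OBSERVATIONS - sum(top_counts)
other_distinct = N_DISTINCT - len(top_counts)
other_pct = 100 * other_count / N_OBSERVATIONS

print(f"Other values ({other_distinct}): {other_count} ({other_pct:.1f}%)")
```

The result matches the reported row: 89 remaining distinct values covering 184091 observations, or 84.5%.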

Zipcode_3
Real number (ℝ≥0)

Distinct: 887
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 487.4041306
Minimum: 6
Maximum: 999
Zeros: 0
Zeros (%): 0.0%
Memory size: 1.7 MiB

Quantile statistics

Minimum: 6
5-th percentile: 60
Q1: 262
Median: 483
Q3: 717
95-th percentile: 954
Maximum: 999
Range: 993
Interquartile range (IQR): 455

Descriptive statistics

Standard deviation: 274.6330887
Coefficient of variation (CV): 0.563460733
Kurtosis: -1.051258348
Mean: 487.4041306
Median Absolute Deviation (MAD): 228
Skewness: 0.09548093925
Sum: 106198049
Variance: 75423.33342
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                    Count     Frequency (%)
770                      768       0.4%
553                      688       0.3%
170                      627       0.3%
70                       626       0.3%
535                      618       0.3%
945                      612       0.3%
604                      612       0.3%
730                      610       0.3%
458                      608       0.3%
956                      608       0.3%
Other values (877)       211508    97.1%

Value                    Count     Frequency (%)
6                        32        < 0.1%
7                        17        < 0.1%
9                        2         < 0.1%
10                       440       0.2%
11                       96        < 0.1%

Value                    Count     Frequency (%)
999                      9         < 0.1%
998                      38        < 0.1%
997                      42        < 0.1%
996                      145       0.1%
995                      147       0.1%

Interactions

[Figures: 42 pairwise interaction plots, rendered with Matplotlib v3.3.2]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear relationship the steepness of the line (its angle to the x-axis) does not affect the magnitude of r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
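
The calculation just described can be sketched in plain Python. This is an illustrative implementation, not the code the report uses; the function name `pearson_r` is hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/n (or 1/(n-1)) factors cancel between numerator and
    # denominator, so plain sums of deviations are sufficient.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (dev_x * dev_y)  # raises ZeroDivisionError for a constant input


x = [1, 2, 4, 7]
y = [1, 3, 2, 5]
print(pearson_r(x, y))
# Shifting and rescaling x (with a positive factor) leaves r unchanged.
print(pearson_r([10 * v + 3 for v in x], y))
```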

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
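
As a minimal sketch of that procedure: rank each variable, then apply the Pearson formula to the ranks. Tied values receive the average of their ranks here, which is a common convention but an assumption about what the report does. The names `average_ranks` and `spearman_rho` are hypothetical.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Pearson correlation computed on the rank variables of xs and ys."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in rx))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in ry))
    return cov / (dev_x * dev_y)


# A nonlinear but monotonic relationship (y = x cubed).
print(spearman_rho([1, 2, 3, 4], [1, 8, 27, 64]))
```

For the cubic example the ranks of both variables are identical, so ρ reaches its maximum even though the relationship is not linear.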

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the discordant pairs divided by the total number of pairs.
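
The sketch below implements the formula exactly as worded above, a variant often called tau-a. Statistical libraries commonly apply a tie correction instead, so values in the report may differ when ties are present. The function name `kendall_tau` is hypothetical.

```python
from itertools import combinations


def kendall_tau(xs, ys):
    """(concordant - discordant) / total number of pairs, with no tie correction."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(xs)), 2))
    for i, j in pairs:
        # Positive product: both variables move in the same direction.
        direction = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
        # A product of zero is a tie and counts as neither.
    return (concordant - discordant) / len(pairs)


# Two concordant pairs and one discordant pair out of three.
print(kendall_tau([1, 2, 3], [1, 3, 2]))
```

This direct version examines every pair, so its cost grows quadratically with the number of observations.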

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
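
Nullity by column is simply a count of missing entries per variable. The sketch below shows that count on toy data. The column names `a`, `b` and `c` and the helper names are hypothetical, and treating both `None` and NaN as missing is an assumption. For the dataset profiled here every count would be 0, since the report lists 0 missing cells.

```python
import math


def is_missing(value):
    """Treat None and float NaN as missing (an assumed convention)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def nullity_by_column(columns):
    """Map each column name to its number of missing entries."""
    return {
        name: sum(is_missing(v) for v in values)
        for name, values in columns.items()
    }


# Toy data only; not taken from the profiled dataset.
toy = {
    "a": [1, None, 3],
    "b": [float("nan"), 2.0, 3.0],
    "c": [1, 2, 3],
}
print(nullity_by_column(toy))
```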

Sample

First rows

df_indexRentPriceSizeRankHomePriceVacancy_Rate%Zipcode_2Zipcode_3
011311.07611179.0274920.173.11634302023
121484.6268621.0415097.504.46464602023
241524.0069640.0247510.423.73290102023
351310.0165289.0264492.507.96025602023
461307.7369579.0309743.6711.56596802023
571399.9267293.0279614.925.45512202023
681753.9569084.0371979.422.84992002023
7101412.9367427.0316128.334.69011702023
8111551.496264.0302772.0815.66614302023
9121850.2868710.0335381.759.36887502023

Last rows

df_indexRentPriceSizeRankHomePriceVacancy_Rate%Zipcode_2Zipcode_3
2178752649471846.536680.0720786.925.02449498981
2178762649481840.942689.0766981.504.88587298981
2178772649491591.401122.0592306.586.73735498981
2178782649501909.5828159.0438970.0013.58024798981
2178792649531413.897640.0317426.754.85376598982
2178802649551059.8723400.0552805.4251.21951298982
217881264956993.8525265.0678499.0051.32924398982
2178822649571533.504981.0314320.836.54016298983
217883264958778.9926185.0150193.1728.53773698983
2178842649591840.866759.0535136.757.34007798983